University of Delaware, Department of Electrical and Computer Engineering, Computer Architecture and Parallel Systems Laboratory

Executable Performance Model and Evaluation of High Performance Architectures with Percolation

Authors

  • Adeline Jacquet
  • Vincent Janot
  • R. Govindarajan
  • Clement Leung
  • Guang Gao
  • Thomas Sterling
Abstract

Percolation has recently been proposed as a key component of an advanced program execution model for future-generation high-end machines, featuring adaptive data/code transformation and movement for effective latency tolerance. Percolation is related to conventional prefetching techniques, but is more aggressive and smarter. A program unit (e.g., a procedure instance) is not ready to be scheduled for execution until the data it needs is in the right place (close to the code in the memory hierarchy) and in the right form (e.g., proper layout). Supporting percolation is a major effort in both architecture design and compiler/runtime software support. An early evaluation of the performance effect of percolation is therefore very important in the design space exploration of future generations of supercomputers. However, evaluating percolation with the traditional approaches of computer architecture (e.g., execution-driven or trace-driven simulation) is both time-consuming and impractical. Further, early-stage architecture design and performance evaluation must deal with incomplete design details, a program execution model with only a (sketchy) conceptual design, or an architecture without an (optimizing) compiler, all of which make simulation-based approaches unsuitable. In this paper, we develop an executable analytical performance model of a high performance multithreaded architecture that supports percolation. A novel feature of our approach is that it models the interaction between the software (program) and hardware (architecture) components. We solve the analytical model using a queuing simulation tool enriched with synchronization. The proposed approach is effective and facilitates obtaining performance trends quickly. Our results indicate that percolation brings significant performance gains (by a factor of 2.7 to 11) when memory latency ranges from local memory access time to remote memory access time (in a multiprocessor system). Further, our results reveal that percolation and multithreading can complement each other and can work together to tolerate memory latency, especially in a multiprocessor system.
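The interplay between multithreading and percolation described in the abstract can be illustrated with a very small toy calculation. The sketch below is not the paper's queuing model: it assumes a textbook-style deterministic multithreading model with zero-cost context switches, and it assumes that percolation simply replaces the remote latency seen by a thread with a shorter "staged" latency. All function and parameter names are hypothetical and chosen only for illustration.

```python
def utilization(threads: int, run_cycles: float, latency: float) -> float:
    """Processor utilization in a toy deterministic multithreading model.

    Each thread computes for `run_cycles`, then stalls for `latency` cycles
    on a memory access. Context switches are assumed to be free, so other
    threads can fill the stall until the processor saturates at 1.0.
    """
    return min(1.0, threads * run_cycles / (run_cycles + latency))


def percolation_speedup(threads: int, run_cycles: float,
                        remote_latency: float, staged_latency: float) -> float:
    """Ratio of utilization with data staged close to the code versus without.

    Assumption (not from the paper): percolation is modeled only as a
    reduction of the stall from `remote_latency` to `staged_latency`;
    the cost of moving or transforming the data is ignored.
    """
    baseline = utilization(threads, run_cycles, remote_latency)
    staged = utilization(threads, run_cycles, staged_latency)
    return staged / baseline


if __name__ == "__main__":
    # Sweep the number of threads for one illustrative latency pair.
    for n in (1, 2, 4, 8):
        s = percolation_speedup(n, run_cycles=10, remote_latency=90, staged_latency=10)
        print(f"threads={n}: toy speedup from staging data = {s:.2f}")
```

In this toy setting the benefit of staging data shrinks as more threads become available to hide the stall, which is one simple way to see why the two mechanisms are studied together. The paper's own model additionally captures queuing and synchronization effects that this sketch omits.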

Similar resources

High speed Radix-4 Booth scheme in CNTFET technology for high performance parallel multipliers

A novel and robust radix-4 Booth scheme implemented in Carbon Nanotube Field-Effect Transistor (CNTFET) technology is presented in this paper. The main advantage of the proposed scheme is its improved speed performance compared with previous designs. With the help of modifications applied to the encoder section using Pass Transistor Logic (PTL), the corresponding capacitances o...

High Speed Delay-Locked Loop for Multiple Clock Phase Generation

In this paper, a high speed delay-locked loop (DLL) architecture is presented which can be employed in high frequency applications. In order to design the new architecture, a new mixed structure is presented for the phase detector (PD) and charge pump (CP) which can be triggered by double edges of the input signals. In addition, the blind zone is removed due to the elimination of the reset signal. Theref...

Dual Space Control of a Deployable Cable Driven Robot: Wave Based Approach

Known for their lower costs and numerous applications, cable robots are an attractive research field in the robotics community. However, because they require an accurate installation procedure and calibration routine, they have not yet found their true place in real-world applications. This paper aims to propose a new controller strategy that requires no meticulous calibration and ...

A New Approach to Solve N-Queen Problem with Parallel Genetic Algorithm

Over the past few decades, great efforts have been made to solve uncertain hybrid optimization problems. The n-Queen problem is one such problem, for which many solutions have been proposed. The traditional methods to solve this problem are exponential in terms of runtime and are not acceptable in terms of space and memory complexity. In this study, parallel genetic algorithms are proposed to solve...

Enhancement of Robust Tracking Performance via Switching Supervisory Adaptive Control

When the process is highly uncertain, even linear minimum phase systems must sacrifice desirable feedback control benefits to avoid an excessive ‘cost of feedback’, while preserving the robust stability. In this paper, the problem of supervisory based switching Quantitative Feedback Theory (QFT) control is proposed for the control of highly uncertain plants. According to this strategy, the unce...

Publication date: 2006